 network traffic classification


Network Traffic Classification Using Machine Learning, Transformer, and Large Language Models

Antari, Ahmad, Abo-Aisheh, Yazan, Shamasneh, Jehad, Ashqar, Huthaifa I.

arXiv.org Artificial Intelligence

This study uses various models to address network traffic classification, categorizing traffic into web, browsing, IPSec, backup, and email. We collected a comprehensive dataset from Arbor Edge Defender (AED) devices, comprising 30,959 observations and 19 features. Multiple models were evaluated, including Naive Bayes, Decision Tree, Random Forest, Gradient Boosting, XGBoost, Deep Neural Networks (DNN), Transformer, and two Large Language Models (LLMs), GPT-4o and Gemini, with zero- and few-shot learning. Transformer and XGBoost showed the best performance, achieving the highest accuracies of 98.95% and 97.56%, respectively. GPT-4o and Gemini showed promising results with few-shot learning, improving accuracy significantly from initial zero-shot performance. While Gemini Few-Shot and GPT-4o Few-Shot performed well in categories like Web and Email, misclassifications occurred in more complex categories like IPSec and Backup. The study highlights the importance of model selection, fine-tuning, and the balance between training data size and model complexity for achieving reliable classification results.
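The zero-shot versus few-shot distinction in this abstract can be illustrated with a small prompt builder. This is only a sketch: the prompt wording, the `build_prompt` and `format_flow` helpers, and the `dst_port`/`bytes` feature names are hypothetical and are not taken from the paper, which does not describe its prompts or its 19 features here. Only the five class labels come from the abstract.

```python
from typing import Dict, Iterable, Tuple

# The five traffic classes named in the abstract.
LABELS = ["web", "browsing", "IPSec", "backup", "email"]

Flow = Dict[str, object]


def format_flow(features: Flow) -> str:
    """Render one flow record as 'name=value' pairs (hypothetical format)."""
    return ", ".join(f"{name}={value}" for name, value in features.items())


def build_prompt(query: Flow, examples: Iterable[Tuple[Flow, str]] = ()) -> str:
    """Build a classification prompt.

    With no examples this is a zero-shot prompt; each labeled example
    that is passed in turns it into a few-shot prompt.
    """
    lines = ["Classify the network flow into one of: " + ", ".join(LABELS) + "."]
    for features, label in examples:
        lines.append(f"Flow: {format_flow(features)} -> {label}")
    # The query is left unlabeled so the model completes the label.
    lines.append(f"Flow: {format_flow(query)} ->")
    return "\n".join(lines)


if __name__ == "__main__":
    shots = [({"dst_port": 25, "bytes": 1800}, "email")]
    print(build_prompt({"dst_port": 443, "bytes": 5200}, shots))
```

The resulting string would be sent to whichever LLM is being evaluated; the model call itself is left out because it depends on the provider.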


Improving the network traffic classification using the Packet Vision approach

Moreira, Rodrigo, Rodrigues, Larissa Ferreira, Rosa, Pedro Frosi, Silva, Flávio de Oliveira

arXiv.org Artificial Intelligence

Network traffic classification improves network management and the services a network offers by taking the kind of application into account. Future network architectures, mainly mobile networks, foresee intelligent mechanisms in their architectural frameworks to deliver application-aware network requirements. Convolutional neural networks (CNNs), widely exploited in several contexts, can be used for network traffic classification. Thus, it is necessary to develop methods that transform packet content into a suitable input for CNNs. Hence, we implemented and evaluated Packet Vision, a method capable of building images from raw packet data, considering both header and payload. Our approach improves on state-of-the-art methods by delivering security and privacy through transforming the raw packet data into images. We built a dataset with four traffic classes and evaluated the performance of three CNN architectures: AlexNet, ResNet-18, and SqueezeNet. Experiments showcase the applicability and suitability of Packet Vision combined with CNNs as a promising approach to deliver outstanding performance in classifying network traffic.
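The general idea of turning packet bytes into an image can be sketched as follows. This is a minimal illustration, not the Packet Vision method itself: the abstract does not specify the image layout, so the square grayscale layout, the zero padding, and the `packet_to_image` helper are all assumptions.

```python
import math
from typing import List, Optional


def packet_to_image(packet: bytes, side: Optional[int] = None) -> List[List[int]]:
    """Map raw packet bytes (header and payload) to a square grayscale image.

    Each byte becomes one pixel with intensity 0-255. If `side` is not
    given, the smallest square that fits every byte is used. Bytes beyond
    side*side are truncated, and unused pixels are zero-padded.
    """
    if side is None:
        # Ceiling of the square root of the packet length (at least 1).
        side = math.isqrt(len(packet) - 1) + 1 if packet else 1
    pixels = list(packet[: side * side])
    pixels += [0] * (side * side - len(pixels))
    return [pixels[row * side:(row + 1) * side] for row in range(side)]


if __name__ == "__main__":
    for row in packet_to_image(bytes([1, 2, 3, 4, 5])):
        print(row)
```

A fixed `side` would be needed in practice so that every packet yields the same input shape for a CNN such as AlexNet, ResNet-18, or SqueezeNet.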


Network Anomaly Traffic Detection via Multi-view Feature Fusion

Hao, Song, Fu, Wentao, Chen, Xuanze, Jin, Chengxiang, Zhou, Jiajun, Yu, Shanqing, Xuan, Qi

arXiv.org Artificial Intelligence

Traditional anomalous traffic detection methods are based on single-view analysis, which has obvious limitations in dealing with complex attacks and encrypted communications. In this regard, we propose a Multi-view Feature Fusion (MuFF) method for network anomaly traffic detection. MuFF models the temporal and interactive relationships of packets in network traffic based on the temporal and interactive viewpoints respectively. It learns temporal and interactive features. These features are then fused from different perspectives for anomaly traffic detection. Extensive experiments on six real traffic datasets show that MuFF has excellent performance in network anomalous traffic detection, which makes up for the shortcomings of detection under a single perspective.


Time-Distributed Feature Learning for Internet of Things Network Traffic Classification

Manjunath, Yoga Suhas Kuruba, Zhao, Sihao, Zhang, Xiao-Ping, Zhao, Lian

arXiv.org Artificial Intelligence

Deep learning-based network traffic classification (NTC) techniques, including conventional and class-of-service (CoS) classifiers, are a popular tool that aids in the quality of service (QoS) and radio resource management for the Internet of Things (IoT) network. Holistic temporal features consist of inter-, intra-, and pseudo-temporal features within packets, between packets, and among flows, providing the maximum information on network services without depending on defined classes in a problem. Conventional spatio-temporal features in the current solutions extract only space and time information between packets and flows, ignoring the information within packets and flows for IoT traffic. Therefore, we propose a new, efficient, holistic feature extraction method for deep-learning-based NTC using time-distributed feature learning to maximize the accuracy of the NTC. We apply a time-distributed wrapper on deep-learning layers to help extract pseudo-temporal features and spatio-temporal features. Pseudo-temporal features are mathematically complex to explain since, in deep learning, a black box extracts them. However, the features are temporal because of the time-distributed wrapper; therefore, we call them pseudo-temporal features. Since our method is efficient in learning holistic-temporal features, we can extend our method to both conventional and CoS NTC. Our solution proves that pseudo-temporal and spatio-temporal features can significantly improve the robustness and performance of any NTC. We analyze the solution theoretically and experimentally on different real-world datasets. The experimental results show that the holistic-temporal time-distributed feature learning method, on average, is 13.5% more accurate than the state-of-the-art conventional and CoS classifiers.
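A time-distributed wrapper, in the general sense used by deep-learning frameworks, applies one layer with shared weights independently to every time step of a sequence. The sketch below shows that mechanism in plain Python. The `dense` and `time_distributed` helpers are hypothetical names, and the paper's actual layers and architecture are not described in the abstract.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Layer = Callable[[Sequence[float]], Vector]


def dense(weights: Sequence[Sequence[float]], bias: Sequence[float]) -> Layer:
    """Return a fully connected layer computing W.x + b (no activation)."""
    def layer(x: Sequence[float]) -> Vector:
        return [
            sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)
        ]
    return layer


def time_distributed(layer: Layer) -> Callable[[Sequence[Sequence[float]]], List[Vector]]:
    """Wrap a layer so it is applied to each time step with shared weights."""
    def wrapped(sequence: Sequence[Sequence[float]]) -> List[Vector]:
        return [layer(step) for step in sequence]
    return wrapped


if __name__ == "__main__":
    # One output unit that sums its two inputs, applied to two time steps.
    summing = time_distributed(dense([[1.0, 1.0]], [0.0]))
    print(summing([[1.0, 2.0], [3.0, 4.0]]))
```

Because the same weights are reused at every step, the output keeps the time axis of the input, which is what lets later layers learn across steps.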


Deep Learning Approaches for Network Traffic Classification in the Internet of Things (IoT): A Survey

Kalwar, Jawad Hussain, Bhatti, Sania

arXiv.org Artificial Intelligence

The Internet of Things (IoT) has witnessed unprecedented growth, resulting in a massive influx of diverse network traffic from interconnected devices. Effectively classifying this network traffic is crucial for optimizing resource allocation, enhancing security measures, and ensuring efficient network management in IoT systems. Deep learning has emerged as a powerful technique for network traffic classification due to its ability to automatically learn complex patterns and representations from raw data. This survey paper aims to provide a comprehensive overview of the existing deep learning approaches employed in network traffic classification specifically tailored for IoT environments. By systematically analyzing and categorizing the latest research contributions in this domain, we explore the strengths and limitations of various deep learning models in handling the unique challenges posed by IoT network traffic. Through this survey, we aim to offer researchers and practitioners valuable insights, identify research gaps, and provide directions for future research to further enhance the effectiveness and efficiency of deep learning-based network traffic classification in IoT.


Generative Adversarial Classification Network with Application to Network Traffic Classification

Ghanavi, Rozhina, Liang, Ben, Tizghadam, Ali

arXiv.org Artificial Intelligence

Large datasets in machine learning often contain missing data, which necessitates the imputation of missing data values. In this work, we are motivated by network traffic classification, where traditional data imputation methods do not perform well. We recognize that no existing method directly accounts for classification accuracy during data imputation. Therefore, we propose a joint data imputation and data classification method, termed generative adversarial classification network (GACN), whose architecture contains a generator network, a discriminator network, and a classification network, which are iteratively optimized toward the ultimate objective of classification accuracy. For the scenario where some data samples are unlabeled, we further propose an extension termed semi-supervised GACN (SS-GACN), which is able to use the partially labeled data to improve classification accuracy. We conduct experiments with real-world network traffic data traces, which demonstrate that GACN and SS-GACN can more accurately impute data features that are more important for classification, and they outperform existing methods in terms of classification accuracy.